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Abstract. To ensure high quality of and trust in both metadata and 
data, their representation in RDF must satisfy certain criteria - speci¬ 
fied in terms of RDF constraints. From 2012 to 2015 together with other 
Linked Data community members and experts from the social, behav¬ 
ioral, and economic sciences {SBE), we developed diverse vocabularies 
to represent SBE metadata and rectangular data in RDF. 

The DDI-RDF Discovery Vocabulary (DDI-RDF) is designed to support 
the dissemination, management, and reuse of unit-record data, i.e., data 
about individuals, households, and businesses, collected in form of re¬ 
sponses to studies and archived for research purposes. The RDF Data 
Cube Vocabulary (QB) is a W3G recommendation for expressing data 
cubes, i.e. multi-dimensional aggregate data and its metadata. Physi¬ 
cal Data Description (PHDD) is a vocabulary to model data in rectan¬ 
gular format, i.e., tabular data. The data could either be represented 
in records with character-separated values {CSV) or fixed length. The 
Simple Knowledge Organization System (SKOS) is a vocabulary to build 
knowledge organization systems such as thesauri, classification schemes, 
and taxonomies. XKOS is a SKOS extension to describe formal statisti¬ 
cal classifications. 

In this paper, we describe RDF constraints to validate metadata on unit- 
record data (DDI-RDF), aggregated data (QB), thesauri (SKOS), and 
statistical classifications (XKOS) and to validate tabular data (PHDD) - 
all of them represented in RDF. We classified these constraints according 
to the severity of occurring constraint violations. This technical report 
is updated continuously as modifying, adding, and deleting constraints 
remains ongoing work. 

Keywords: RDF Validation, RDF Constraints, DDI-RDF Discovery 
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1 Introduction 

For constraint formulation and RDF data validation, several languages exist or 
are currently developed. Shape Expressions (ShEx), Resource Shapes (ReSh), De- 


scription Set Profiles (DSP), OWL 2, the SPARQL Inferencing Notation (SPIN), 
and SPARQL are the six most promising and widely used constraint languages. 
OWL 2 is used as a constraint language under the closed-world and unique 
name assumptions. The W3C currently develops SHACL, an RDF vocabulary 
for describing RDF graph structures. With its direct support of validation via 
SPARQL, SPIN is very popular and certainly plays an important role for future 
developments in this field. It is particularly interesting as a means to validate 
arbitrary constraint languages by mapping them to SPARQL [3]. Yet, there is no 
clear favorite and none of the languages is able to meet all requirements raised 
by data practitioners. Further research and development therefore is needed. 

In 2013, the W3C organized the RDF Validation Workshop!! where experts 
from industry, government, and academia discussed first use cases for constraint 
formulation and RDF data validation. In 2014, two working groups on RDF 
validation have been established to develop a language to express constraints 
on RDF data: the W3C RDF Data Shapes Working Grow^ (33 participants 
of 19 organizations) and the DCMI RDF Application Profiles Task Group^ (29 
people of 22 organizations) which among others bundles the requirements of 
data institutions of the cultural heritage sector and the social, behavioral, and 
economic (SEE) sciences and represents them in the W3C group. 

Within the DCMI task group, a collaboratively curated database of RDF 
validation requirement^ has been created which contains the findings of the 
working groups based on various case studies provided by data institutions [2]. 
It is publicly available and open for further contributions. The database connects 
requirements to use cases, case studies, and implementations and forms the basis 
of this paper. We distinguish 81 requirements to formulate constraints on RDF 
data; each of them corresponding to a constraint type. 

In this paper, we collected constraints for commonly used vocabularies in the 
SBE domain (see Section HD, either from the vocabularies themselves or from 
domain and data experts, in order to gain a better understanding about the role 
of certain requirements for data quality and to direct the further development 
of constraint languages. We let the experts classify the constraints according to 
the severity of their violation. 


2 Common Vocabularies in SBE Sciences 

We took all well-established and newly developed SBE vocabularies into account 
and defined constraints for three vocabularies commonly used in the SBE sciences 
which are briefly introduced in the following. We analyzed actual data according 
to constraint violations, as for these vocabularies large data sets are already 
published. 


http://www.w3.org/2012/12/rdf-val/ 


http://www.w3.org/2014/rds/charter 


http://wiki.dubhncore.org/index.php/RDF-Apphcation-ProHles 


Online available at: http://purl.org/net/rdf-validation 



SBE sciences require high-quality data for their empirical research. For more 
than a decade, members of the SBE community have been developing and using a 
metadata standard, composed of almost twelve hundred metadata fields, known 
as the Data Documentation Initiative (DDI),^ an XML format to disseminate, 
manage, and reuse data collected and archived for research [10]. In XML, the 
definition of schemas containing constraints and the validation of data according 
to these constraints is commonly used to ensure a certain level of data quality. 
With the rise of the Web of Data, data professionals and institutions are very 
interested in having their data be discovered and used by publishing their data 
directly in RDF or at least publish accurate metadata about their data to fa¬ 
cilitate data integration. Therefore, not only established vocabularies like SKOS 
are used; recently, members of the SBE and Linked Data community developed 
with the DDI-RDF Discovery Vocabulary (DDI-RDF^ a means to expose DDI 
metadata as Linked Data. 

The data most often used in research within SBE sciences is unit-record data, 
i.e., data collected about individuals, businesses, and households, in form of re¬ 
sponses to studies or taken from administrative registers such as hospital records, 
registers of births and deaths. A study represents the process by which a data 
set was generated or collected. The range of unit-record data is very broad - 
including census, education, health data and business, social, and labor force 
surveys. This type of research data is held within data archives or data libraries 
after it has been collected, so that it may be reused by future researchers. By its 
nature, unit-record data is highly confidential and access is often only permitted 
for qualified researchers who must apply for access. Researchers typically rep¬ 
resent their results as aggregated data in form of multi-dimensional tables with 
only a few columns: so-called variables such as sex or age. Aggregated data, 
which answers particular research questions, is derived from unit-record data 
by statistics on groups or aggregates such as frequencies and arithmetic means. 
The purpose of publicly available aggregated data is to get a first overview and 
to gain an interest in further analyses on the underlying unit-record data. For 
more detailed analyses, researchers refer to unit-record data including additional 
variables needed to answer subsequent research questions. 

Formal childcare is an example of an aggregated variable which captures 
the measured availability of childcare services in percent over the population in 
European Union member states by the dimensions year, duration, age of the 
child, and country. Variables are constructed out of values (of one or multiple 
datatypes) and/or code lists. The variable age, e.g., may be represented by values 
of the datatype xsdmonNegativeInteger or by a code list of age clusters (e.g., 
’0 to 10’ and ’ll to 20’). The RDF Data Cube Vocabulary (QB^ is a W3C 
recommendation for representing data cubes, i.e., multi-dimensional aggregated 
data, in RDF [3]. A qb:DataStructureDefinition contains metadata of the data 
collection. The variable/ormaZ childcare is modeled as qb:measure, since it stands 

^ http://www.ddialliance.org/Specification/ 

® http://rdf-vocabulary.ddialliance.org/discovery.html 
® http://www.w3.org/TR/vocab-data-cube/ 



for what has been measured in the data collection. Year, duration, age, and 
country are qb:dimensions. Data values, i.e., the availability of childcare services 
in percent over the population, are collected in a qb:DataSet. Each data value is 
represented inside a qb:0bserration which contains values for each dimension. 

For more detailed analyses we refer to the underlying unit-record data. The 
aggregated variable formal childcare is calculated on the basis of six unit-record 
variables (i.a.. Education at pre-school) for which detailed metadata is given (i.a., 
code lists) enabling researchers to replicate the results shown in aggregated data 
tables. DDI-RDF is used to represent metadata on unit-record data in RDF. The 
study {disco:Study) for which the unit-record data has been collected contains 
eight data sets {disco:LogicalDataSet) including variables {disco:Variable) like 
the six ones needed to calculate the variable formal childcare. 

The Simple Knowledge Organization System (SKOS) is reused to a large 
extend to build SBE vocabularies. The codes of the variable Education at pre¬ 
school are modeled as skos:Concepts and a skos:OrderedCollection organizes 
them in a particular order within a skos'.memberList. A variable may be asso¬ 
ciated with a theoretical concept {skos:Concept) and skos:narrower builds the 
hierarchy of theoretical concepts within a skos:ConceptScheme of a study. The 
variable Education at pre-school is assigned to the theoretical concept Child Care 
which is a narrower concept of the top concept Education. Controlled vocabular¬ 
ies {skos:ConceptScheme), serving as extension and reuse mechanism, organize 
types {skos'.Concept) of descriptive statistics {disco'.SummaryStatistics) like min¬ 
imum, maximum, and arithmetic mean. 

3 Classification of RDF Constraints according to the 
Severity of Constraint Violations 

A concrete constraint is instantiated from one of the 81 constraint types and 
is defined for a specific vocabulary. It does not make sense to determine the 
severity of constraint violations of an entire constraint type, as the severity de¬ 
pends on the individual context and vocabulary. SBE experts determined the 
default severity levels for each constraint to indicate how serious the violation 
of the constraint is. We use the classification system of log messages in software 
development like Apache Log4j 2 [T], the Java Logging ap/EI and the Apache 
Commons Logging apIB as many data practitioners also have experience in 
software development and software developers intuitively understand these lev¬ 
els. We simplify this commonly accepted classification system and distinguish 
the three severity levels (1) informational, (2) warning, and (3) error. Violations 
of informational constraints point to desirable but not necessary data improve¬ 
ments to achieve RDF representations which are ideal in terms of syntax and 
semantics of used vocabularies. Warnings are syntactic or semantic problems 

The possibility to define severity levels in vocabularies is in itself a requirement 

{R-158). 

http://docs.oracle.eom/javase/7/docs/api/java/util/logging/Level.html 
^^ http://commons.apache.org/proper/commons- logging/ 



which typically should not lead to an abortion of data processing. Errors, in 
contrast, are syntactic or semantic errors which should cause the abortion of 
data processing. Although we provide default severity levels for each constraint, 
validation environments should enable users to adapt the severity levels of con¬ 
straints according to their individual needs. 

4 RDF Constraints by RDF Constraint Type 

4.1 Subsumption 

A subclass axiorr^^ {concept inclusion in DL) states that the class Cl is a sub¬ 
class of the class C2 - Cl is more specific than C2, i.e. each resource of the class 
Cl must also be part of the class extension of C2. 

— DISCO-C-SUBSUMPTION-01: All disco:Universes must also be skosiConcepts 
(Universe E Concept). 

• severity level: ERROR 

4.2 Class Equivalence 

Class Equivalenc^^ asserts that two concepts have the same instances. While 
synonyms are an obvious example of equivalent concepts, in practice one more 
often uses concept equivalence to give a name to complex expressions [7] . Concept 
equivalence is indeed subsumption from left and right (A E i? and B ^ A implies 
A = B). 

— DISCO-C-CLASS-EQUIVALENCE-01: All sio:SI0.000367 resources 
must also be disco .'Variabl es ( Variable = SI0_000367). The Semanticscience 
Integrated Ontology (SlOjl^l provides a simple, integrated ontology of types 
and relations for rich description of objects, processes and their attributes. 
sio:SIO-000367 is a variable defined as a value that may change within the 
scope of a given problem or set of operations. Thus, sio:SIO-000367 is equiv¬ 
alent to disco'.Variable. 

• severity level: INFO 

4.3 Sub Properties 

Sub Provertie¥^ state that the property PI is a sub property of the property 
P2 - that is, if an individual x is connected by PI to an individual or a literal 
y, then x is also connected by P2 to y. 

— DISCO-C-SUB-PROPERTIES-01: If an individual x is connected by 
disco'.fundedBy to an individual y, then x is also connected by determs:contributor 
to y (fundedBy E contributor). 

• severity level: ERROR 

R-IOO-SUBSUMPTION 

R-3-EQUIVALENT-CLASSES 

https://code.google.com/p/semanticscience/wiki/SIO 
R-54-SUB-OBJECT-PROPERTIES, R-64-SUB-DATA-PROPERTIES 



4.4 Property Domains 


Property DomainW^ {domain restrictions on roles in DL) restrict the domain of 
object and data properties. The purpose is to declare that a given property is 
associated with a class. In 00 terms this is the declaration of a member, field, 
attribute or association. 3i?.T E C is the object property restriction where R is 
the object property (role) whose domain is restricted to concept C. 

— DISCO-C-PROPERTY-DOMAIN-01: Property Domain constiaints are 
defined for each Disco object and data property. Only disco:Questions, e.g., 

can have dzsco.'responseUomaOT relationships (3 responseDomain. T E Question). 

• Severity level: ERROR 

— DATA-CUBE-C-PROPERTY-DOMAIN-01: Property Domain con¬ 
straints are defined for each Data Cube object and data property. Only 
qb.'Observations, e.g., can have qbidataSet relationships (3 dataSet.T E 
Observation). 

• Severity level: ERROR 

— DCAT-C-PROPERTY-DOMAIN-01: Property Domain consiramts are 
defined for each DCAT object and data property. Only deat:Catalogs, e.g., 
can have dcat.'dataset relationships (3 dataset. T E Catalog). 

• Severity level: ERROR 

— PHDD-C-PROPERTY-DOMAIN-01: Property Domain coastraiats are 
defined for each PHDD object and data property. Only phdd:Tables, e.g., can 
have p/idd.dsS'trMctMredRj/relationships (3 isStructuredBy. T E Table). 

• Severity level: ERROR 

— SKOS-C-PROPERTY-DOMAIN-01: Property Domain coastraiats are 
defined for each SKOS object and data property. Only skos:ConceptSehemes, 

e.g., can have skos:hasTopConcept relationships (3 hasTopConcept. T E ConceptScheme). 

• Severity level: ERROR 

— XKOS-C-PROPERTY-DOMAIN-01: Property Domain coastraiats are 
defined for each XKOS object and data property. 

• Severity level: ERROR 


4.5 Property Ranges 

Property Range^^ {range restrictions on roles in DL) restrict the range of object 
and data properties. T E t/R.C is the range restriction to the object property R 
(restricted by the concept C). 

R-25-OBJECT-PROPERTY-DOMAIN, R-26-DATA-PROPERTY-DOMAIN 
R-28-OBJECT-PROPERTY-RANGE, R-35-DATA-PROPERTY-RANGE 



— DISCO-C-PROPERTY-RANGES-01: Property Range constraints are 
defined for each Disco object and data property, disco:caseQuantity relation¬ 
ships, e.g., can only point to literals of the datatype xsdmonNegativeInteger 
(T E V caseQuantity.nonNegativeInteger). 

• Severity level: ERROR 

— DATA-CUBE-C-PROPERTY-RANGES-01: Property Range coTLSiiahAs 
are defined for each Data Cube object and data property. qb:order rela¬ 
tionships, e.g., can only point to literals of the datatype xsd:string (T E V 
order.string). 

• Severity level: ERROR 

— DCAT-C-PROPERTY-RANGES-01: Property Range constraints are 
defined for each DC AT object and data property. dcat:hytes relationships, 

e.g., can only point to literals of the datatype xsd.'znte^er (T E V bytes . integer). 

• Severity level: ERROR 

— PHDD-C-PROPERTY-RANGES-01: Property Range constraints are 
defined for each PHDD object and data property, phdd:caseQuantity rela¬ 
tionships, e.g., can only point to literals of the datatype xsd:nonNegativeInteger 
(T E V caseQuantity.nonNegativeInteger). 

• Severity level: ERROR 

— SKOS-C-PROPERTY-RANGES-01: Property Range constraints are 
defined for each SKOS object and data property. 

• Severity level: ERROR 

— XKOS-C-PROPERTY-RANGES-01: Property Range constraints are 
defined for each XKOS object and data property. xkos:belongsTo relation¬ 
ships, e.g., can only point to instances of the class skos:Concept (T E V 
belongsTo.Concept). 

• Severity level: ERROR 

4.6 Inverse Object Properties 

In many cases, properties are used bi-directionally and then accessed in the 
inverse direction, e.g. parent = child“. There should be a way to declare value 
type, cardinality etc of those inverse relations without having to declare a new 
property URL The object property OPl is an invers^H of the object property 
OP2. Thus, if an individual x is connected by OPl to an individual j/, then y is 
also connected by OP2 to x, and vice versa. 

— DISCO-C-INVERSE-OBJECT-PROPERTIES-01: disco:CategoryStatistics 
resources are accessed from codes {skos:Concepts) via disco:statisticsCategory~. 

• severity level: ERROR 

— DISCO-C-INVERSE- OBJECT-PROPERTIES-02: disco: Summary Statistics 
resources are accessed from disco .'Variables via disco:statisticsVariable~. 

• severity level: ERROR 

— DISCO-C-INVERSE-OBJECT-PROPERTIES-03. disco:Variables are 
accessed from disco:Questions via disco:question~. 

• severity level: ERROR 
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4.7 Symmetric Object Properties 


A role is symmetric if it is equivalent to its own inverse [7]. An object property 
symmetry axionj^ states that the object property expression OPE is symmetric 
- that is, if an individual x is connected by OPE to an individual y, then y is 
also connected by OPE to x. 


4.8 Asymmetric Object Properties 

A property is asymmetric!^ if it is disjoint from its own inverse [7]. An object 
property asymmetry axiom states that the object property OP is asymmetric - 
that is, if an individual x is connected by OP to an individual y, then y cannot 
be connected by OP to x. 

DISCO-C-ASYMMETRIC-OBJECT-PROPERTIES-01: A disco:Variable 
may be based on a disco:RepresentedVariable. A disco:RepresentedVariable, 
however, cannot be based on a discoiVariable. This is a kind of mistake 
which may occur as a semantically equivalent object property for the other 
direction may also be possible {disco:basisOf) (basedOn n basedOn~ E T). 

• severity level: ERROR 


4.9 Reflexive Object Properties 

Reflexive Object Provertie^^ {reflexive roles, global reflexivity in DL) can be 
expressed by imposing local reflexivity on the top concept [7]. 


4.10 Irreflexive Object Properties 

An object property is irreflexivj^ {irreflexive role in DL) if it is never locally 
reflexive [7]. An object property irreflexivity axiom IrreflexiveObjectProperty( 
OPE ) states that the object property expression OPE is irreflexive - that is, no 
individual is connected by OPE to itself. 

- DISCO-C-IRREFLEXIVE-OBJECT-PROPERTIES-01: In Disco, 
every object property is irreflexive. No individual is connected by the ob¬ 
ject property mstritmenf to itself (T E -■ linstrument.Self). 

• severity level: ERROR 


R-61-SYMMETRIC-OBJECT-PROPERTIES 
R-62-ASYMMETRIC-OBJECT-PROPERTIES 
R-59-REFLEXIVE-OBJECT-PROPERTIES 
R-60-IRREFLEXIVE-OBJECT-PROPERTIES 



4.11 Class-Specific Irrefiexive Object Properties 

A property is irrefiexive if it is never locally reflexive [7]. An object property 
irreflexivity axiom states that the object property OP is irrefiexive - that is, no 
individual is connected by OP to itself. Class-Specific Irrefiexive Object Prop¬ 
erties are object properties which are irrefiexive within a given context, e.g. a 
class. 

- DISCO-C- CL A SS-SPECIFIC-IRREFLEXIVE- OBJECT-PROPERTIES- 

01: Within the Disco context, skos:Concepts cannot be related via the object 
property sfcos;6oader to themselves (Concept E “'3 broader. Self .). 

• severity level: ERROR 

- DISCO-C- CL A SS-SPECIFIC-IRREFLEXIVE- OBJECT-PROPERTIES- 

02: Within the Disco context, skos:Concepts cannot be related via the object 
property sfcos;narrower to themselves (Concept E “'3 narrower. Self .). 

• severity level: ERROR 


4.12 Disjoint Properties 

A disjoint properties axion^^ states that all of the properties are pairwise dis¬ 
joint; that is, no individual x can be connected to an individual/literal y by 
these properties. 

- DATA-CUBE-C-DISJOINT-PROPERTIES-01: All Data Cube prop¬ 
erties (not having the same domain and range classes) are defined to be 
pairwise disjoint. The properties qb:dataSet and qb:structure are disjoint 
{dataSet E -^structure). 

• severity level: ERROR 

- DCAT-C-DISJOINT-PROPERTIES-01: All DCAT properties (not 
having the same domain and range classes) are defined to be pairwise dis¬ 
joint. 

• severity level: ERROR 

- DISCO-C-DISJOINT-PROPERTIES-01: All Disco properties (not hav¬ 
ing the same domain and range classes) are defined to be pairwise disjoint. 

The properties disco-.variable and disco: question are disjoint (variable E 
^question). 

• severity level: ERROR 

- PHDD-C-DISJOINT-PROPERTIES-01: All PHDD properties (not 
having the same domain and range classes) are defined to be pairwise dis¬ 
joint. The properties phdd:isStructuredBy and phdd:column are disjoint (isStructuredBy 
^column) 

• severity level: ERROR 
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- SKOS-C-DISJOINT-PROPERTIES-01: All SKOS properties (not hav¬ 
ing the same domain and range classes) are defined to be pairwise disjoint. 

• severity level: ERROR 

- SKOS-C-DISJOINT-PROPERTIES-0^: Disjoint Labels Violation: 

Covers condition S13 from the SKOS reference document stating that ” skos:prefLabel, 
skos'.altLabel and skos:hiddenLabel are pairwise disjoint properties”. 

• Implementation: A SPARQL query collects all labels of all concepts, 
building an in-memory structure. This structure is then checked for dis¬ 
joint entries. 

• severity level: ERROR 

- XKOS-C-DISJOINT-PROPERTIES-01: All XKOS properties (not 
having the same domain and range classes) are defined to be pairwise dis¬ 
joint. 

• severity level: ERROR 


4.13 Disjoint Classes 

Disjoint Classe^^ state that all of the classes are pairwise disjoint; that is, no 
individual can be at the same time an instance of these disjoint classes. 

- DATA-CUBE-C-DISJOINT-CLASSES-01: All Data Cube classes are 
defined to be pairwise disjoint. 

• severity level: ERROR 

- DCAT-C-DISJOINT-CLASSES-01: All DCAT classes are defined to 
be pairwise disjoint. 

• severity level: ERROR 

- DISCO-C-DISJOINT-CLASSES-01: All Disco classes are defined to 
be pairwise disjoint (e.g. Study n Variable E T). 

• severity level: ERROR 

- PHDD-C-DISJOINT-CLASSES-01: All PHDD classes are defined to 
be pairwise disjoint. 

• severity level: ERROR 

- SKOS-C-DISJOINT-CLASSES-01: All SKOS classes are defined to be 
pairwise disjoint. 

• severity level: ERROR 

- XKOS-C-DISJOINT-CLASSES-01: All XKOS classes are defined to 
be pairwise disjoint. 

• severity level: ERROR 

Corresponds to qSKOS Quality Issues - SKOS Semi-Formal Consistency Issues - 

Disjoint Labels Violation 

R-7-DISJOINT-CLASSES 



4.14 Context-Specific Property Groups 


The Context-Specific Property Group^f\ constraint groups data and object prop¬ 
erties within a context (e.g. a class). 


4.15 Context-Specific Inclusive OR of Properties 

Inclusive or is a logical connective joining two or more predicates that yields the 
logical value ’’true” when at least one of the predicates is true. Context-Specific 
Inclusive OR of Propertie^^ constraints specify that individuals are valid if they 
have at least one property relationship of one or multiple properties stated within 
a given context. The context can be an application profile, a shape, or a class, 
i.e., the constraint applies for individuals of this specific class. 

4.16 Context-Specific Inclusive OR of Property Groups 

At least one property group must match for individuals of a specific context. 
Context may be a class, a shape, or an application profile. 


4.17 Recursive Queries 

Resource Shapes is a recursive languag^^ (the value shape of a Resource Shape 
is in turn another Resource Shape). There is no way to express that in SPARQL 
without hand-waving ’’and then you call the function again here” or ’’and then 
you embed this operation here” text. The embedding trick doesn’t work in the 
general case because SPARQL can’t express recursive queries, e.g. ’’test that this 
Issue is valid and all of the Issues that references, recursively”. Most SPARQL 
engines already have functions that go beyond the official SPARQL 1.1 spec. 
The cost of that sounds manageable. 


4.18 Individual Inequality 


An individual inequality axion^^ Differentlndividuals( ai ... a^ ) states that 
all of the individuals a^, 1 ^ i ^ n, are different from each other; that is, no 
individuals a^ and a^ with i ¥= j can be derived to be equal. This axiom can 
be used to axiomatize the unique name assumption — the assumption that all 
different individual names denote different individuals. 

R-66-PROPERTY-GROUPS 

R-202-CONTEXT-SPECIFIC-INCLUSIVE-OR-OE-PROPERTIES 

R-222-RECURSIVE-QUERIES 

R-lfiDISJOINT-INDIVIDUALS 



4.19 Equivalent Properties 


An equivalent object properties axiorr^^ Equivalent ObjectProperties( OPEi ... 

OPEn ) states that all of the object property expressions OPEi, 1 ^ i ^ n, 
are semantically equivalent to each other. This axiom allows one to use each 
OPEi as a synonym for each OPEj — that is, in any expression in the ontology 
containing such an axiom, OPEi can be replaced with OPEj without affecting the 
meaning of the ontology. The axiom EquivalentObjectProperties( OPEi OPE 2 ) 
is equivalent to the following two axioms SubObjectPropertyOf( OPEi OPE 2 ) 
and SubObjectPropertyOf( OPEi OPE 2 ). 

An equivalent data properties axionrn EquivalentDataProperties( DPEi ... 

DPEn ) states that all the data property expressions DPEi, 1 ^ i ^ n, are 
semantically equivalent to each other. This axiom allows one to use each DPEi as 
a synonym for each DPEj — that is, in any expression in the ontology containing 
such an axiom, DPE^ can be replaced with DPEj without affecting the meaning 
of the ontology. The axiom EquivalentDataProperties( DPEi DPE 2 ) can be 
seen as a syntactic shortcut for the following axiom SubDataPropertyOf( DPEi 
DPE 2 ) and SubDataPropertyOf( DPEi DPE 2 ). 

DATA-CUBE-C-EQUIVALENT-PROPERTIES-01: Equivalent prop¬ 
erties from different versions of Data Cube can be marked as equivalent. As a 
consequence, the properties can be replaced by each other without affecting 
the meaning. 

— DCAT-C-EQUIVALENT-PROPERTIES-01: Equivalent properties from 
different versions of DC AT can be marked as equivalent. As a consequence, 
the properties can be replaced by each other without affecting the meaning. 

DISCO-C-EQUIVALENT-PROPERTIES-01: Equivalent properties 
from different versions of Disco can be marked as equivalent, e.g. disco:containsVariable 
and disco'.variable. As a consequence, the properties can be replaced by each 
other without affecting the meaning. 

— PEtDD-C-EQUIVALENT-PROPERTIES-01 : Equivalent properties from 
different versions of PHDD can be marked as equivalent. As a consequence, 
the properties can be replaced by each other without affecting the meaning. 

— SKOS-C-EQUIVALENT-PROPERTIES-01: Equivalent properties from 
different versions of SKOS can be marked as equivalent. As a consequence, 
the properties can be replaced by each other without affecting the meaning. 

— XKOS-C-EQUIVALENT-PROPERTIES-01: Equivalent properties from 
different versions of XKOS can be marked as equivalent. As a consequence, 
the properties can be replaced by each other without affecting the meaning. 

R-4-EQUIVALENT-OBJECT-PROPERT1ES 

R-5-EQUIVALENT-DATA-PROPERTIES 



4.20 Property Assertions 

Property Assertion^^ and includes positive property assertions and negative 
property assertions. A positive object property assertion ObjectPropertyAsser- 
tion( OPE ai 02 ) states that the individual ai is connected by the object 
property expression OPE to the individual a. 2 - A negative objeet property asser¬ 
tion NegativeObjectPropertyAssertion( OPE ai 02 ) states that the individual ai 
is not connected by the object property expression OPE to the individual a 2 . A 
positive data property assertion DataPropertyAssertion( DPE a It ) states that 
the individual a is connected by the data property expression DPE to the literal 
It. A negative data property assertion NegativeDataPropertyAssertion( DPE a It 
) states that the individual a is not connected by the data property expression 
DPE to the literal It. 

4.21 Data Property Facets 

For datatype properties it should be possible to declare frequently needed facetJ^ 
to drive user interfaces and validate input against simple conditions, including 
min/max value, regular expressions, string length - similar to XSD datatypes. 
Constraining facets, to restrict datatypes of RDF literals, may be: xsddength, 
xsd:minLength, xsd:maxLength, xsd:pattern^ xsd:enumeration^ xsd:whiteSpaee, 
xsd'.maxlnclusive, xsd:maxExelusive, xsd:minExelusive, xsd:minlnclusive, xsd:total 
Digits, xsd:fractionDigits. 

- DISCO-C-DATA-PROPERTY-FACETS-01: The abstract of a series 
[determs'.abstract) should have a minimum length {xsd:minLength) of some 
determined minimum length X. 

• severity level: WARNING 

- DISCO-C-DATA-PROPERTY-FACETS-02: The abstract of a study 
{determs:abstract) should have a minimum length {xsd:minLength) of some 
determined minimum length X. 

• severity level: WARNING 


4.22 Literal Pattern Matching 

There are multiple use cases associated with the requirement to match literals 
according to given pattern^. 

- DISCO-C-LITERAL-PATTERN-MATCHING-01: Each disco:Variable 
of a given diseo:LogicalDataSet must have a given prefix for its variable name 
(sko s'.notation). 

• severity level: INFO 

R-96-PROPERTY-ASSERTIONS 
R-46-CONSTRAINING-FACETS 
R-44-PATTERN-MATCHING-ON-RDF-LITERALS 



4.23 Negative Literal Pattern Matching 

Literals of given data properties within given contexts do not have to match 
given patternQ 

- DISCO-C-NEGATIVE-LITERAL-PATTERN-MATCHING-01 : 

4.24 Object Property Paths 

Object Property Path^\ (or Object Property Chains and in DL terminology 
complex role inclusion axiom or role composition) is the more complex form 
of sub properties. This axiom states that, if an individual x is connected by a 
sequence of object property expressions OPEi, OPE^ with an individual y, 
then X is also connected with y by the object property expression OPE. Role 
composition can only appear on the left-hand side of complex role inclusions [7] . 


4.25 Intersection 

Concept inclusions allow us to state that all mothers are female and that all 
mothers are parents, but what we really mean is that mothers are exactly the 
female parents. DLs support such statements by allowing us to form complex 
concepts such as the intersectioT^^ (also called conjunction) which denotes the 
set of individuals that are both female and parents. A complex concept can 
be used in axioms in exactly the same way as an atomic concept, e.g., in the 
equivalence Mother = Female n Parent . 


4.26 Disjunction 

A union class expressior^^ contains all individuals that are instances of at least 
one class Ci for 1 ^ i ^ n. A union data range contains all tuples of literals 
that are contained in at least one data range DRi for 1 ^ i ^ n. Synonyms of 
disjunction are union and inclusive or. 

— DISCO-C-DISJUNCTION-01: Only disco: Variables or disco .-Questions 

or disco:RepresentedVariables can have disco.'concept relationships to skos:Concepts. 
Variable u Question u RepresentedVariable E V concept.Concept 
• severity level: ERROR 

R-U-PATTERN-MATCHING-ON-RDF-LITERALS 
R-55-OBJECT-PROPERTY-PATHS 

R-15-CONJUNCTION-OF-CLASS-EXPRESSIONS, R-16-CONJUNCTION-OF- 
DATA-RANGES 

R-17-DISJUNGTION-OF-GLASS-EXPRESSIONS, R-18-DISJUNGTION-OF- 
DATA-RANGES 



4.27 Negation 

A complement class exvressioT^l ObjectComplementOf( CE ) contains all indi¬ 
viduals that are not instances of the class expression CE. 

4.28 Existential Quantifications 

An existential class expressioT^^ {existential restriction in DL terminology) con¬ 
tains all those individuals that are connected by the property P to an individual 
X that is an instance of the class C or to literals that are in the data range DR. 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-01 : There must be 
at least one disco .'universe relationship from disco:Study Groups to disco:Universe 
(StudyGroup E 3 universe.Universe). 

• severity level: ERROR 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-02: There must be 
at least one disco .'universe relationship from disco:Studies to disco: Universe 
(Study E 3 universe.Universe). 

• severity level: ERROR 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-03: Theremaybe 
a disco.'universe relationship from disco:RepresentedVariable to disco:Universe 
(RepresentedVariable E 3 universe.Universe). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-04 : There may be 
a disco .'universe relationship from disco:Variable to disco:Universe (Variable 
E 3 universe.Universe). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-05: Theremaybe 
a disco .'universe relationship from disco'.Question to disco: Universe (Question 
E 3 universe.Universe). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-06: There may be 
a disco:universe relationship from disco:LogicalDataSet to disco: Universe 
(LogicalDataSet E 3 universe.Universe). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-07: There is no re¬ 
lationship {disco:ddifile) to a DDI-XML file containing further information 
about the series {disco'.StudyGroup) for further analyses. 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-08: There is no re¬ 
lationship {disco'.ddifile) to a DDI-XML file containing further information 
about the study {disco'.Study) for further analyses. 

• severity level: INFO 

R-19-NEGATION-OF-CLASS-EXPRESSIONS, R-20-NEGATION-OF-DATA- 
RANGES 

R-86-EXISTENTIAL-Q UANTIFIG A TION- ON-PROPERTIES 



DISCO-C-EXISTENTIAL-QUANTIFICATIONS-09: It is important 
to know the kind of data (disco:kindOfData) collected for a particular series 
(disco'.StudyGroup). Survey data, e.g., is much easier accessible than census 
data. For census data, it is necessary to get in contact with the individual 
data archive and if data access is granted it may take months to actually get 
the data. 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-10: It is important 
to know the kind of data (disco:kindOfData) collected for a particular study 
(disco:Study). Survey data, e.g., is much easier accessible than census data. 

For census data, it is necessary to get in contact with the individual data 
archive and if data access is granted it may take months to actually get the 
data. 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-11 : Information about 
the temporal coverage (determs:temporal) of a series (disco.'StudyGroup) is 
of interest for particular queries (e.g. to search for all series of a given year 
(temporal coverage) and for which data is collected in which countries (spa¬ 
tial coverage) about which topics (topical coverage)). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-12: Information about 
the spatial coverage (determs:spatial) of a series (disco'.StudyGroup) is of in¬ 
terest for particular queries (e.g. to search for all series of a given year (tem¬ 
poral coverage) and for which data is collected in which countries (spatial 
coverage) about which topics (topical coverage)). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-13: Information about 
the topical coverage (determs.'subject) of a series (disco'.StudyGroup) is of in¬ 
terest for particular queries (e.g. to search for all series of a given year (tem¬ 
poral coverage) and for which data is collected in which countries (spatial 
coverage) about which topics (topical coverage)). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-I 4 : Information about 
the temporal coverage (determs:temporal) of a study (disco:Study) is of in¬ 
terest for particular queries (e.g. to search for all studies of a given year 
(temporal coverage) and for which data is collected in which countries (spa¬ 
tial coverage) about which topics (topical coverage)). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-15: Information about 
the spatial coverage (determs'.spatial) of a study (disco:Study) is of interest 
for particular queries (e.g. to search for all studies of a given year (tem¬ 
poral coverage) and for which data is collected in which countries (spatial 
coverage) about which topics (topical coverage)). 

• severity level: INFO 



- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-16: Information about 
the topical coverage {determs’.subject) of a study {disco.’Study) is of interest 

for particular queries (e.g. to search for all studies of a given year (tem¬ 
poral coverage) and for which data is collected in which countries (spatial 
coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-17: Information about 
the temporal coverage {determstemporal) of a data set {disco’.LogicalDataSet) 

is of interest for particular queries (e.g. to search for all data sets of a given 
year (temporal coverage) and for which data is collected in which countries 
(spatial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-18: Information about 
the spatial coverage {determs’.spatial) of a data set {disco’.LogicalDataSet) is 

of interest for particular queries (e.g. to search for all data sets of a given 
year (temporal coverage) and for which data is collected in which countries 
(spatial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-19: Information about 
the topical coverage {determs.’subject) of a data set {disco’.LogicalDataSet) 

is of interest for particular queries (e.g. to search for all data sets of a given 
year (temporal coverage) and for which data is collected in which countries 
(spatial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-20: Information about 
the temporal coverage {determs.’temporal) of a data file {disco:DataFile) is 

of interest for particular queries (e.g. to search for all data files of a given 
year (temporal coverage) and for which data is collected in which countries 
(spatial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-21 : Information about 
the spatial coverage {determs.’spatial) of a data file {disco:DataFile) is of in¬ 
terest for particular queries (e.g. to search for all data files of a given year 
(temporal coverage) and for which data is collected in which countries (spa¬ 
tial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-22: Information about 
the topical coverage {dcterms:subject) of a data file {disco:DataFile) is of in¬ 
terest for particular queries (e.g. to search for all data files of a given year 
(temporal coverage) and for which data is collected in which countries (spa¬ 
tial coverage) about which topics (topical coverage)). 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-23: Information about 
creators {determs:creator) (persons or organizations) of a series is important 
when searching for series of the same creators. 

• severity level: INFO 



- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-24: Information about 
creators {determs:creator) (persons or organizations) of a studies is impor¬ 
tant when searching for studies of the same creators. 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-Q UANTIFICATIONS -25: If summary statis¬ 
tics are collected for studies, detailed further analyses are possible. 

• severity level: INFO 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-26: If category statis¬ 
tics are collected for studies, detailed further analyses are possible. 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-27: If a study has 
no associated data sets, the actual description of the data is missing. Even¬ 
tually, it is very hard or even impossible to get access to the data. 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-28: If there is no 
data file for a given data set, the description of the data set and the con¬ 
taining study is not sufficient. 

• severity level: WARNING 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-29: The case quan¬ 
tity measures how many cases are collected for a study. High case quantity 
(disco:caseQuantity), stated for data files, is an indicator for high statistical 
quality of the underlying study. It indicates how comprehensive the study 
is. 

• severity level: WARNING 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-30: High variable 
quantity (disco:variableQuantity), stated for data files, is an indicator for 
high statistical quality of the underlying study. It indicates how comprehen¬ 
sive the study is. 

• severity level: WARNING 

- DISCO-C-EXISTENTIAL-QUANTIFICATIONS-31: High variable 
quantity (disco:variableQuantity), stated for data sets, is an indicator for 
high statistical quality of the underlying study. It indicates how comprehen¬ 
sive the study is. 

• severity level: WARNING 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS -32: There is no sum¬ 
mary statistics type information {disco:summaryStatisticsType) for a sum¬ 
mary statistics resource. 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS -33: There is no sum¬ 
mary statistics value {rdf:value) for a summary statistics resource. 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-34: There is no re¬ 
lationship to a variable {disco:statisticsVariable) for a summary statistics 
resource. 

• severity level: ERROR 



DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-35: Category statis¬ 
tics resources must be related {disco.'statisticsCategory) to codes/categories 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-36: Category statis¬ 
tics resources must have at minimum one value for either frequency, percent¬ 
age, or cumulative percentage. 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-37: Codes should 
be associated with categories (human-readable labels). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-38: An instrument 
{discoinstrument) may have a link {disco:externalDocumentation) to the 
questionnaire. 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-39: Questions ( disco:Question) 
must have question texts {disco:questionText). 

• severity level: ERROR 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-4 0: Questions ( disco:Question) 
may have response domains {diseo:responseDomain). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS-41 : Questionnaires 
{disco:Questionnaire) may contain {disco:question) questions {disco:Question). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICATIONS-42: Questions ( disco:Question) 
may have question numbers {skos:prefLabel). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-QUANTIFICATIONS-43: Variables {disco: Variable) 
may have relationships {diseo.-question) to questions {disco:Question), as 
variables are created out of questions or calculated on the basis of other 
variables. 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS-44 : Datasets {disco:LogiealDataSet) 
may have {disco:variable) variables {disco'.Variable). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS-4 5: Variables {disco: Variable) 
may have {disco.-concept) an associated theoretical concept {skos:Concept). 

• severity level: INFO 

DISCO-C-EXISTENTIAL-Q UANTIFICA TIONS-4 6: Each variable 
{disco.'Variable) should have {diseo-.representation) a variable representation 
{disco.'Representation) which is either an ordered code list {skos:OrderedColleetion), 
an unordered code list {skos:Conceptscheme) or a union of datatypes {rdfs:Datatype). 

• severity level: WARNING 


DATA-CUBE-C-EXISTENTIAL-QUANTIFICATIONS -OJ ; Dimen¬ 
sions have range {IC-4 0) - Every dimension declared in a qb:DataStructureDefinition 
must have a declared rdfs:range. 


• severity level: ERROR 

- DATA-CUBE-C-EXISTENTIAL-QUANTIFICATIONS-02: Concept 
dimensions have code lists {IC-5 [5]) - Every dimension with range skos:Concept 
must have a qb:codeList. 

• severity level: ERROR 

DATA-CUBE-C-EXISTENTIAL-QUANTIFICATIONS-03: DSD in¬ 
cludes measure {IC-3 [5]) - Every qbiDataStructureDefinition must include 
{qb:component, qb:componentProperty) at least one declared measure. 

• severity level: ERROR 

DATA-CUBE-C-EXISTENTIAL-QUANTIFICATIONS-04 : Slice 
Keys must be declared (7(7-7 [5]) - Every qb:SliceKey must be associated with 

{qb:sliceKey) a qb:DataStructureDefinition {SliceKey E 3 sliceKey” . DataStructureDef inition). 

• severity level: ERROR 


4.29 Universal Quantifications 

A universal class exvressioT^^ {value restriction in DL) contains all those indi¬ 
viduals that are connected by an object property only to individuals that are 
instances of a particular class. 

- DATA-CUBE-C-UNIVERSAL-QUANTIFICATIONS-01: Univer¬ 
sal quantifications are defined for each Data Cube object and data property. 

• Severity level: ERROR 

- DCAT-C-UNIVERSAL-QUANTIFICATIONS-01: Universal quan¬ 
tifications are defined for each DCAT object and data property. Only dcat:Catalogs 

can have dcot.'datoset relationships to dcat.'Datasets (Catalog E V dataset. Dataset). 

• Severity level: ERROR 

- DISCO-C-UNIVERSAL-QUANTIFICATIONS-01: Universal quan¬ 
tifications are defined for each Disco object and data property. Only disco:LogicalDataSets 
can have disco aggregation relationships to qb:DataSets (LogicalDataSet E 

V aggregation.DataSet). 

• Severity level: ERROR 

PHDD-C-UNIVERSAL-QUANTIFICATIONS-01: Universal quan¬ 
tifications are defined for each PHDD object and data property. 

• Severity level: ERROR 

- SKOS-C-UNIVERSAL-QUANTIFICATIONS-01: Universal quantifi¬ 
cations are defined for each SKOS object and data property. 

• Severity level: ERROR 

- XKOS-C-UNIVERSAL-QUANTIFICATIONS-01: Universal quan¬ 
tifications are defined for each XKOS object and data property. 

• Severity level: ERROR 
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4.30 Minimum Unqualified Cardinality Restrictions 

A minimum cardinality restrictio'i¥^ contains all those individuals that are con¬ 
nected by a property to at least n different individuals/literals that are instances 
of a particular class or data range. If the class is missing, it is taken to be 
owl'.Thing. If the data range is missing, it is taken to be rdfs:Literal. ^ nR.T 
is the minimum unqualified cardinality restriction where n e N (written ^ nR 
in short). For unqualified cardinality restrictions, classes respective data ranges 
are not stated. 

4.31 Minimum Qualified Cardinality Restrictions 

A minimum cardinality restrictioi^^ contains all those individuals that are con¬ 
nected by a property to at least n different individuals/literals that are instances 
of a particular class or data range. If the class is missing, it is taken to be 
owl'.Thing. If the data range is missing, it is taken to be rdfs:Literal. > nR.C is 
a minimum qualified cardinality restriction where n e N. 

- DATA-CUBE-C-MINIMUM-QUALIFIED-CARDINALITY-RESTRICTIONS 

01: Minimum Qualified Cardinality Restrictions constraints are defined for 
each Data Cube object and data property. 

• Severity level: ERROR 

- DATA-C UBE- C-MINIMUM- Q UALIFIED- CA RDINALIT Y-RES TRICTIONS 

02: Unique data set {IC-1 [5]) - Every qb:Observation has {qb:dataSet) ex¬ 
actly one associated g&.'UafaS'et (Observation c '^1 dataSet .DataSet n 
dataSet. DataSet). 

• Severity level: ERROR 

- DC AT- C-MINIMUM- Q UALIFIED-CARDINALITY-RESTRICTIONS- 

01: Minimum Qualified Cardinality Restrictions constraints are defined for 
each DCAT object and data property. 

• Severity level: ERROR 

- DISCO-C-MINIMUM-Q UALIFIED-CARDINALITY-RESTRICTIONS- 

01: Minimum Qualified Cardinality Restrictions constraints are defined for 
each Disco object and data property. A disco .'Questionnaire, e.g., has at 
least one disco:question relationship to disco:Questions (Questionnaire E 
question. Question). 

• Severity level: ERROR 

PHDD-C-MINIMUM-QUALIFIED-CARDINALITY-RESTRICTIONS- 

01: Minimum Qualified Cardinality Restrictions constraints are defined for 
each PHDD object and data property. 

• Severity level: ERROR 

R-81-MINIMUM-UNQUALIFIED-CARDINALITY-ON-PROPERTIES, R-211- 

CARDINALITY-CONSTRAINTS 

R-75-MIN1MUM-QUALIEIED-CARD1NAL1TY-ON-PROPERT1ES, R-211- 

CARDINALITY-CONSTRAINTS 



4.32 M^Lximum Unqualified Cardinality Restrictions 

A maximum cardinality restriction contains all those individuals that are con¬ 
nected by a property to at most n different individuals/literals that are instances 
of a particular class or data range. If the class is missing, it is taken to be 
owhThing. If the data range is not present, it is taken to be rdfs:Literal. Un¬ 
qualified means that the class respective the data range is not stated. ^ nR.T 
is a maximum unqualified cardinality restrictior^^ where n G N (written 5= nR 
in short). 

4.33 Maximum Qualified Cardinality Restrictions 

A maximum cardinality restriction contains all those individuals that are con¬ 
nected by a property to at most n different individuals/literals that are in¬ 
stances of a particular class or data range. If the class is missing, it is taken 
to be owhThing. If the data range is not present, it is taken to be rdfs:Literal. 

Qualified means that the class respective the data range is stated. ^ nR.C is a 
maximum qualified cardinality restrictioi^^ where n G N. 

DISCO-C-MAXIMUM- Q UALIFIED-CARDINALITY-RESTRICTIONS- 

01: A disco'.Variable has at most one disco:concept relationship to a theo¬ 
retical concept {skos:Concept) (Variable E concept. Concept). 

• Severity level: ERROR 

- DATA-CUBE-C-MAXIMUM-QUALIFIED-CARDINALITY-RESTRICTIONS- 

01: Unique data set (IC-1 [5]) - Every qb:Observation has [qb:dataSet) ex¬ 
actly one associated g&.'UotaS'et (Observation E dataSet.DataSet n 
dataSet. DataSet). 

• Severity level: ERROR 

4.34 Exact Unqualified Cardinality Restrictions 

An exact cardinality restrictioi^^ contains all those individuals that are con¬ 
nected by a property to exactly n different individuals that are instances of a 
particular class or data range. If the class is missing, it is taken to be owhThing. 

If the data range is not present, it is taken to be rdfs:Literal. Unqualified means 
that the class respective data range is not stated. ^ nR.T n ^ nR.T is an exact 
unqualified cardinality restriction where n G N. 

DATA-CUBE-C-EXACT-UNQUALIFIED-CARDINALITY-RESTRICTIONS- 

01: Unique slice structure {IC-9 [5]) - Each qb.Slice must have exactly one 
associated gbisliceStructure. 

• Severity level: ERROR 

R-82-MAXIMUM-UNQUALIFIED-CARDINALITY-ON-PROPERTIES, R-211- 
CARDINALITY-CONSTRAINTS 

R-76-MAXIMUM-QUALIEIED-CARDINALITY-ON-PROPERTIES, R-211- 

CARDINALITY-CONSTRAINTS 

R-80-EXACT-UNQUALIFIED-CARDINALITY-ON-PROPERTIES, R-211- 

CARDINALITY-CONSTRAINTS 



4.35 Exact Qualified Cardinality Restrictions 


An exact cardinality restrictioT^^ contains all those individuals that are con¬ 
nected by a property to exactly n different individuals that are instances of a 
particular class or data range. If the class is missing, it is taken to be owhThing. 
If the data range is not present, it is taken to be rdfs:Literal. > nR.Cn ^ nR.C 
is an exact qualified cardinality restriction where n G N. 


DISCO-C-EXA CT- Q UALIFIED-CARDINALITY-RESTRICTIONS- 

01: A disco;Question has exactly 1 disco.'wniferse relationship to disco:Universe 
(Question E universe.Universe n universe.Universe). 


• Severity level: ERROR 

DATA-CUBE-C-EXA CT- Q UALIFIED-CARDINALITY-RESTRICTIONS- 

02: Unique DSD {IC-2 [S]) - Every qb:DataSet has {qb:structure) exactly one 

associated qb:DataStructureDefinition (DataSet E structure.DataStructureDefinition 
n structure.DataStructureDefinition). 

• Severity level: ERROR 

4.36 Transitive Object Properties 

Transitivity is a special form of complex role inclusion. An object property tran¬ 
sitivity axiorr^^ states that the object property is transitive — that is, if an 
individual x is connected by the object property to an individual y that is con¬ 
nected by the object property to an individual z, then x is also connected by the 
object property to 2 . 

4.37 Context-Specific Exclusive OR of Properties 

Exclusive or is a logical operation that outputs true whenever both inputs differ 
(one is true, the other is false). Only one of multiple properties within some 
context (e.g. a class, a shape, or an application profile) leads to valid dateP^. This 
constraint is generally expressed in DL as follows: C E (^A n R) u (A n -^B). 

4.38 Context-Specific Exclusive OR of Property Groups 

Exclusive or is a logical operation that outputs true whenever both inputs differ 
tmnp is true the other is false). Only one of multiple property groups leads to 



R-74-EXACT-QUALIFIED-CARDINALITY-ON-PROPERTIES, 


R-211- 


CARDINALITY-CONSTRAINTS 

R-63-TRANSITIVE-OBJECT-PROPERTIES 
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DISCO-C-CONTEXT-SPECIFIC-EXCL USIVE-OR-OF-PROPERTY- 
GROUPS-01: Within the context of Disco, skos:Concepts can have either 
skos.’definition (when interpreted as theoretical concepts) or skos:notation 
and skos:prefLabel properties (when interpreted as codes and categories), 

Concept E (^ D n C) u (D n -■ C) 

D = A n B 

but not both. A E > 1 notation.string n ^ 1 notation.string 
B E ^ 1 prefLabel.string n ^ 1 prefLabel.string 

C E > 1 definition.string n ^ 1 definition.string 

• severity level: INFO 

4.39 Allowed Values 

It is a common requirement to narrow down the value space of a property by an 
exhaustive enumeration of the valid values (both literals or resources). This is 
often rendered in drop down boxes or radio buttons in user interfaces. Allowed 
value^^ for properties can be IRIs, IRIs (matching one or multiple patterns), 

(any) literals, literals of a list of allowed literals (e.g. ’red’ ’blue’ ’green’), typed 
literals of one or multiple type(s) (e.g. xsd:string). 

— DISCO-C-ALLOWED-VALUES-01. disco:CategoryStatistics can only 
have disco:computationBase relationships to the values valid and invalid of 

the datatype (CategoryStatistics = V computationBase . {valid, invali 

n langString). 

• severity level: ERROR 

4.40 Not Allowed Values 

A matching triple has any literal / object except those explicitly excludeci^. 

4.41 Literal Ranges 

PI is a data property (of an instance of class Cl) and its literal value must be 
between the range of [Vmin,Vmax^ 

— DISCO-C-LITERAL-RANGES-01: disco'.percentage (Aoramw. disco:CategoryStatistics) 
literals must be of the datatype xsd:double whose range should be restricted 

to be between 0 and 100. 

• severity level: ERROR 

— DISGO-G-LITERAL-RANGES-02: disco:cumulativePercentage (domain: 
disco:CategoryStatistics) literals must be of the datatype xsd:double whose 
range should be restricted to be between 0 and 100. 

• severity level: ERROR 

R-30-ALLOWED-VALUES-FOR-RDF-OBJECTS and R-37-ALLOWED-VALUES- 
FOR-RDF-LITERALS 

R-33-NEGATIVE-OBJECT-CONSTRAINTS, R-200-NEGATIVE-LITERAL- 
CONSTRAINTS 

R-45-RANGES-OF-RDF-LITERAL-VALUES 



4.42 Negative Literal Ranges 


PI is a data property (of an instance of class Cl ) and its literal value must not 
be between the range of [Vminymax^ 

4.43 Required Properties 

Properties may be requirecQ. 

4.44 Optional Properties 

Properties may be optional^. 

4.45 Repeatable Properties 

Properties may be reneatablj^. 

4.46 Negative Property Constraints 

Instances of a specific class must not have some object propertjl^. 


4.47 Individual Equality 

Individual eaualitiF^ states that two different names are known to refer to the 
same individual [7]. 


4.48 Functional Properties 

An object property functionality axion^^ FunctionalObiectProvertvf OPE j states 
that the object property expression OPE is functional — that is, for each indi¬ 
vidual a;, there can be at most one distinct individual y such that x is connected 
by OPE to y. Each such axiom can be seen as a syntactic shortcut for the 
following axiom: SubClassOff owLThing ObjectMaxCardinality( 1 OPE ) ). 

R-I 42 -NEGATIVE-RANGES-OF-RDF-LITERAL-VALUES 
R- 68 -REQUIRED-PROPERTIES 
R-69-OPTIONAL-PROPERTIES 
®® R-70-REPEATABLE-PROPERTIES 

®® R-52-NEGATIVE-OBJECT-PROPERTY-GONSTRAINTS, R-53-NEGATIVE- 
DATA-PROPERTY-GONSTRAINTS 
R- 6 -EQUIVALENT-INDIVIDUALS 
R-57-FUNGT10NAL-OBJEGT-PROPERTIES 



4.49 Inverse-Functional Properties 

An object property inverse functionality axion^^ InverseFunctionalObjectProp- 
erty( OPE ) states that the object property expression OPE is inverse-functional 
- that is, for each individual x, there can be at most one individual y such that 
y is connected by OPE with x. Each such axiom can be seen as a syntactic 
shortcut for the following axiom: SubClassOff owhThing ObjectMaxCardinality( 
1 ObjectInverseOf( OPE ) ) ). 

- DISCO-C-INVERSE-FUNCTIONAL-PROPERTIES-01: For each 
rdfs:Resource x, there can be at most one distinct rdfs.Resource y such that 
y is connected by admsidentifier to x (funct identifier^). 

• severity level: ERROR 

DISCO-C-INVERSE-FUNCTIONAL-PROPERTIES-02: Keys are 
even more general than inverse-functional properties, as a key can be a 
data, an object property, or a chain of properties [S]. For this generaliza¬ 
tion purposes, as there are different sorts of key, and as keys can lead to 
undecidability, DL is extended with key boxes and a special keyfor construct 
(identifier keyfor Resource) [8]. OWL 2 hasKey implements keyfor and 
thus can be used to identify resources uniquely, to merge resources with iden¬ 
tical key property values, and to recognize constraint violations. 

• severity level: ERROR 


4.50 Value Restrictions 

Individual Value Restriction^^ A has-value class expression ObjectHasValue( 
OPE a ) consists of an object property expression OPE and an individual a, 
and it contains all those individuals that are connected by OPE to a. Each 
such class expression can be seen as a syntactic shortcut for the class expression 
ObjectSomeValuesErom( OPE ObjectOneOf( a ) ). Literal Value Restrictions: A 
has-value class expression DataHasValue( DPE It ) consists of a data property 
expression DPE and a literal It, and it contains all those individuals that are 
connected by DPE to It. Each such class expression can be seen as a syntactic 
shortcut for the class expression DataSomeValuesErom( DPE DataOneOff It ) 
)■ 

4.51 Self Restrictions 

A self-restriction ObjectHasSelf( OPE ) consists of an object property expres¬ 
sion OPE, and it contains all those individuals that are connected by OPE to 
themselves. 

“ R-58-INVERSE-FUNCTIONAL-OBJECT-PROPERTIES 
“ R-88-VALUE-RESTRICTIONS 



4.52 Primary Key Properties 


The Primary Key Propertie^^ constraint is often useful to declare a given 
(datatype) property as the ’’primary key” of a class, so that a system can enforce 
uniqueness and also automatically build URIs from user input and data imported 
from relational databases or spreadsheets.. Starfleet officers, e.g., are uniquely 
identified by their command authorization code (e.g. to activate and cancel auto- 
destruct sequences). It means that the property commandAuthorizationCode is 
inverse functional - mapped to DL as follows: (f unct commandAuthorizationCode“) 
Keys, however, are even more general, i.e., a generalization of inverse functional 
properties [9]. A key can be a datatype property, an object property, or a chain 
of properties. For this generalization purposes, as there are different sorts of key, 
and as keys can lead to undecidability, DL is extended with key boxes and a spe¬ 
cial key for construct [5] . This leads to the following DL mapping (only one simple 
property constraint): commandAuthorizationCode keyfor StarfleetOfficer 

— see inverse-functional properties 


4.53 Class-Specific Property Range 

Class-Specific Property Rang^^ restricts the range of object and data properties 
for individuals within a specific context (e.g. class, shape, application profile). 

The values of each member property of a class may be limited by their value 
type, such as xsd:string or foaf:Person. 

- DISCO-C-CLASS-SPECIFIC-PROPERTY-RANGE-01: Only disco:Questions 
can have discoiquestionText relationships to literals of the datatype rdf:langString 
(—■Question E —'3 questionText. langString). 

• severity level: ERROR 

4.54 Class-Specific Reflexive Object Properties 

Using DL terminology Class-Specific Reflexive Object Properties is called local 
reflexivity - a set of individuals (of a specific class) that are related to themselves 
via a given role [7]. 


4.55 Membership in Controlled Vocabularies 

Resources can only be members of listed controlled vocabularie®. 
R-226-PRIMARY-KEY-PROPERTIES 

R-29-CLASS-SPECIFIC-RANGE-OF-RDF-OBJECTS, R-36-CLASS-SPECIFIC- 
RA NGE- OF-RDF-LITERA LS 

R-32-MEMBERSHIP-OF-RDF-OBJECTS-IN-CONTROLLED-VOCABULARIES, 

R-39-MEMBERSHIP-OF-RDF-LITERALS-IN-CONTROLLED- 

VOCABULARIES 



DISCO-C-MEMBERSHIP-IN- CONTR OLLED- VO CAB UL ARIES- 

01 : disco'.Summary Statistics can only have disco: summary StatisticType re¬ 
lationships to skos: Concepts which must be members of the controlled vo¬ 
cabulary ddicv: Summary StatisticType which is a skos:ConceptScheme. 
SummaryStatistics E \/summary StatisticType. A 
A = Concept n \/inScheme.B 
B = ConceptScheme n {SummaryStatisticType} 

• severity level: ERROR 

- DATA-CUBE-C-MEMBERSHIP-IN-CONTROLLED-VOCABULARIES- 

01: Codes from code list {IC-19 [S]) - If a dimension property has a qb:codeList, 
then the value of the dimension property on every qb:Observation must be 
in the code list. 

• severity level: ERROR 


4.56 IRI Pattern Matching 

IRI pattern matching applied on subjects, properties, and object^. 

— DISCO-C-IRI-PATTERN-MATCHING-01 : disco:Study resources must 
match a given IRI pattern. 

• severity level: INFO 


4.57 Literal Value Comparison 

Depending on the property semantics, there are cases where two different literal 
values must have a specific ordering with respect to an operator. PI and P2 are 
the datatype properties we need to compare and OP is the comparison operator 
(<, <=, >, >=, =, !=0. The COMP Pattern, one of the Data Quality Test 
Patterns, can be used to validate the Literal Value Comparison constraint [B]: 


SELECT ?s WHERE { 

?s •/.•/.? 1'/.'/. ?vl . 

?s •/.•/.P2n ?v2 . 

FILTER ( ?vl •/.'/.OPZ'/. ?v2 ) } 


— DISCO-C-LITERAL-VALUE-COMPARISON-01 : disco:startDates must 
be before (‘<’) disco:endDates. To validate this constraint we bind the vari¬ 
ables as follows (PI: disco:startDate, P2: disco:endDate, OP: <). 

• severity level: ERROR 

R-21-IRI-PATTERN-MATCHING-ON-RDF-SUBJECTS, R-22-IRI-PATTERN- 
MA TCHING- ON-RDF- OBJEGTS, R-23-IRI-PA TTERN-MA TGHING- ON-RDF- 
PROPERTIES 

R-43-LITERAL-VALUE-GOMPARISON 




4.58 Ordering 


With this constraint objects of object properties can be ordered as well as literals 
of data propertied- 

In DDI, variables, questions, and codes/categories are typically organized in 
a particular order. For obtaining this order, skos:OrderedCollection resources are 
used. 

— DISCO-C-ORDERING-01 : If disco'.Variables of a given disco :LogicalDataSet 
should be ordered, a collection of variables must be present in the data 
and connected with the data set. The collection of variables is of the type 
skos'.OrderedCollection containing multiple variables (each represented as 
skos:Concept) in a skos:memherList. 

• severity level: INFO 

— DISCO-C-ORDERING-02'. If disco:Questions of agiven disco:Questionnaire 
should be ordered, a collection of questions must be present in the data and 
connected with the questionnaire. The collection of questions is of the type 
skos'.OrderedCollection containing multiple questions (each represented as 
skos.'Concept) in a skos'.memberList. 

• severity level: INFO 

— DISCO-C-ORDERING-03 : If codes/categories ( skos:Concepts) of a given 
disco:Representation of a given disco: Variable should be ordered, the vari¬ 
able representation should also be of the type skos'.OrderedCollection con¬ 
taining multiple codes/categories (each represented as skos.'Concept) in a 
skos'.memberList. 

• severity level: INFO 


4.59 Validation Levels 

Different levels of severity (priorityshould be assigned to constraints. Possi¬ 
ble validation levels could be: informational, warning, error, fail, should, recom¬ 
mended, must, may, optional, closed (only this) constraints, open (at least this) 
constraint. 

For Disco each constraint should be assigned to exactly one validation level. 


4.60 String Operations 

Many different string overationj^ are possible. Some constraints require building 
new strings out of other strings. Calculating the string length would also be 
another constraint of this type. 

R-121-SPECIFY-ORDER-OF-RDF-RESOURCES, R-217-DEFINE-ORDER-FOR- 
FORMS/DISPLAY 

™ R-205-VARYING-LEVELS-OF-ERROR, R-135-CONSTRAINT-LEVELS, R-158- 
SEVERITY-LEVELS-OF-CONSTRAINT- VIOLATIONS, R-193-MULTIPLE- 
CONSTRAINT-VALIDATION-EXECUTION-LEVELS 
R-194-PROVIDE-STRING-FUNCTIONS-FOR-RDF-LITERALS 



— DISCO-C-STRING-OPERATIONS-01 : The title of a study [determs: title) 
(e.g. ’EU-SILC 2005’) may be calculated out of the title of the containing 
series [determs:title) [e.g. ’EU-SILC’) and the human-readable label of the 
study [rdfs.dabel) (e.g. ’2005’). 

• severity level: INFO 


4.61 Context-Specific Valid Classes 

What types are valid in a specific contextContext can be an input stream, a 
data creation function, or an API. 

- DATA-CUBE-C-CONTEXT-SPECIFIC-VALID-CLASSES-01:¥oy 

future versions of Data Cube, out-dated classes can be marked as deprecated. 

• severity level: INFO 

DCAT-C-CONTEXT-SPECIEIC-VALID-CLASSES-01: For future 
versions of DCAT, out-dated classes can be marked as deprecated. 

• severity level: INFO 

DISCO-C-CONTEXT-SPECIFIC- VALID-CLASSES-01 : For future 
versions of Diseo, out-dated classes can be marked as deprecated. 

• severity level: INFO 

PHDD-C-CONTEXT-SPECIFIC-VALID-CLASSES-01 : For future 
versions of PHDD, out-dated classes can be marked as deprecated. 

• severity level: INFO 

- SKOS-C-CONTEXT-SPECIFIC-VALID-CLASSES-01. For future 
versions of SKOS, out-dated classes can be marked as deprecated. 

• severity level: INFO 

- XKOS-C-CONTEXT-SPECIFIC-VALID-CLASSES-01: For future 
versions of XKOS, out-dated classes can be marked as deprecated. 

• severity level: INFO 


4.62 Context-Specific Valid Properties 

What properties can be used within this contextilll Context can be an data 
receipt function, data creation function, or API. 

- DATA-CUBE-C-CONTEXT-SPECIFIC-VALID-PROPERTIES-01: 

For future versions of Data Cube, out-dated properties can be marked as dep¬ 
recated. 

• severity level: INFO 

R-209-VALID-CLASSES 

R-210-VALID-PROPERTIES 



DCAT-C-CONTEXT-SPECIFIC- VALID-PROPERTIES-01 : For fu¬ 
ture versions of DC AT, out-dated properties can be marked as deprecated. 

• severity level: INFO 

- DISCO-C-CONTEXT-SPECIFIC-VALID-PROPERTIES-01: For 

future versions of Disco, out-dated properties can be marked as deprecated. 

• severity level: INFO 

- PHDD-C-CONTEXT-SPECIFIC-VALID-PROPERTIES-01: For in- 

ture versions of PHDD, out-dated properties can be marked as deprecated. 

• severity level: INFO 

SKOS-C-CONTEXT-SPECIFIC- VALID-PROPERTIES-01: For fu¬ 
ture versions of SKOS, out-dated properties can be marked as deprecated. 

• severity level: INFO 

XKOS-C-CONTEXT-SPECIFIC-VALID-PROPERTIES-01: For iu- 

ture versions of XKOS, out-dated properties can be marked as deprecated. 

• severity level: INFO 


4.63 Default Values 

Default value^^ for objects and literals are inferred automatically. It should be 
possible to declare the default value for a given property, e.g. so that input forms 
can be pre-populated and to insert a required property that is missing in a web 
service call. 

- DISCO-C-DEFAULT-VALUES-01: The value ’true’ for the property 
discodsPublic {xsdiboolean) indicates that the data set {disco:LogicalDataSet) 
can be accessed (usually downloaded) by anyone. Per default, access to data 
sets should be restricted (’false’). 

• severity level: INFO 


4.64 Mathematical Operations 

Examples for Mathematical Operation^^ are the addition of two dates, the ad¬ 
dition of days to a start date, and statistical computations (e.g. average, mean, 
sum). 

— DISCO-C-MATHEMATICAL-OPERATIONS-01: The sum of disco:percentage 
(datatype: xsd:double) values of all codes (represented as skos:Concepts) of 
a code list {skos:ConceptScheme or skos:OrderedCollection) , serving as rep¬ 
resentation of a particular discoiVariable, must exactly be 100. 

• severity level: ERROR 

R-31-DEFAULT-VALUES-OF-RDF-OBJECTS, R-38-DEFAULT-VALUES-OF- 

RDF-LITERALS 

R-42-MATHEMATICAL-OPERATIONS, R-fl-STATISTICAL-COMPUTATIONS 



DISCO-C-MATHEMATICAL-OPERATIONS-02: For a given vari¬ 
able, the sum of the frequencies of all codes of the variable’s code list has to 
be equal to the variable’s total number of cases (summary statistics of the 
type ’number of cases’). 

• severity level: ERROR 

DISCO-C-MATHEMATICAL-OPERATIONS-03: For a given vari¬ 
able, the sum of ’valid cases’ and ’invalid cases’ has to be equal to the total 
’number of cases’. 

• severity level: ERROR 

- DISCO-C-MATHEMATICAL-OPERATIONS-04 : For a given vari¬ 
able, the total ’number of cases’ value for the country ’All’ must be equal to 
the sum of the total ’number of cases’ value for each country. 

• severity level: ERROR 

DISCO-C-MATHEMATICAL-OPERATIONS-05: Minimum values 
do not have to be greater than maximum values (disco: Summary Statistics). 

• severity level: ERROR 


4.65 Language Tag Matching 

For particular data properties, values must be stated for predefined language^. 

- DISCO-C-LANGUA GE-TA G-MATCHING-01: There must be an En¬ 
glish variable name (skos'.notation) for each disco'.Variable within disco:LogicalDataSets. 
• severity level: INFO 


4.66 Language Tag Cardinality 

For particular data properti^ values of predefined languages must be stated for 
determined number of timeqlj. 

- DISGO-G-LANGUAGE-TAG-GARDINALITY-01: There must be 

at least one English disco:questionText for each disco:Question within disco:LogicalDataSets. 

• severity level: INFO 

DISCO-C-LANGUAGE-TAG-CARDINALITY-02: There should be 
at most one English literal value for variable names (skos:notation, domain: 
disco: Variable). 

• severity level: INFO 

- DISCO-C-LANGUAGE-TAG-GARDINALITY-03. For each ques¬ 
tion {disco:Question), there must be at least one question text {disco:questionText) 
associated with a language tag of an arbitrary language or with an English 
language tag. 

• severity level: INFO 

R-47-LANGUAGE-TAG-MATCHING 

R-49-RDF-LITERALS-HAVING-AT-MOST-ONE-LANGUAGE-TAG, R-48- 
MISSING-LA NG UA GE- TA GS 



- SKOS-C-LANGUAGE-TAG-CARDINALITY-0^: Omitted or In¬ 
valid Language Tags: Some controlled vocabularies contain literals in natural 
language, but without information what language has actually been used. 
Language tags might also not conform to language standards, such as RFC 
3066. 

• Implementation: Iteration over all triples in the vocabulary that have a 
predicate which is a (subclass of) rdfsdabel or skos:note. 

• Severity level: WARNING 

- SKOS-G-LANG UAGE-TAG-GARDINALITY-0^: Incomplete Lan¬ 
guage Coverage: Some concepts in a thesaurus are labeled in only one lan¬ 
guage, some in multiple languages. It may be desirable to have each concept 
labeled in each of the languages that also are used on the other concepts. 
This is not always possible, but incompleteness of language coverage for some 
concepts can indicate shortcomings of the vocabulary. 

• Severity level: INFO 

- SKOS- G-LANG UA GE- TA G-GARDINALITY-0^: No Common Lan¬ 
guage: Checks if all concepts have at least one common language, i.e. they 
have assigned at least one literal in the same language. 

• Severity level: INFO 

- SKOS-G-LANGUAGE-TAG-GARDINALITY-0^; Inconsistent Pre¬ 
ferred Labels: According to the SKOS reference document, ”A resource has 
no more than one value of skos:prefLabel per language tag”. 

• Implementation: A SPARQL query is used to find concepts with at least 
two prefLabels. In a second step, the language tags of these prefLabels 
are analyzed and an ambiguity is detected if they are equal. 

• Severity level: INFO 


4.67 Whitespace Handling 

Avoid whitespaces in literals neither leading nor trailing white snaceJ^. 

- DISGO-G- WHITESPA GE-HANDLING-01: Delete whitespaces of se¬ 
ries and study abstracts {determs:abstract] domain: disco:StudyGroup, disco:Study) 
automatically. 

• severity level: INFO 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Omit¬ 
ted or Invalid Language Tags 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - In¬ 
complete Language Coverage 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - No 
Common Language 

Corresponds to qSKOS Quality Issues - SKOS Semi-Formal Consistency Issues - 
Inconsistent Preferred Labels 

R-50- WHITESPACE-HANDLING-OF-RDF-LITERALS 



4.68 HTML Handling 


Check if all HTML tags, included in literals (of specific data properties within 
the context of specific classes are closed properly. 

- DISCO-C-HTML-HANDLING-01: Check if all HTML tags, included 
in literals of all Disco data properties, are closed properly. 

• severity level: INFO 

- DISCO-C-HTML-HANDLING-02: Check if all HTML tags, included 
in literals of all data properties whose domains are Disco classes, are closed 
properly. 

• severity level: INFO 


4.69 Conditional Properties 

If specific properties exist, then specific other properties must also be presently. 

DISGO-G-GONDITIONAL-PROPERTIES-01 : If a sfcos. Concept rep¬ 
resents a code (having a skosinotation property) and a category (having a 
skos:prefLabel property), then the property disco:isValid has to be stated 
indicating if the code is valid (’true’) or missing (’false’). 

• severity level: ERROR 

— DISGO-G-GONDITIONAL-PROPERTIES-02: In order to get an overview 
over a series or a study either an abstract, a title, an alternative title, or links 

to external descriptions should be stated. If the abstract (determs:abstract) 
of a series [disco: Study Group) and an external description of the series 
[disco.'ddifile) is missing, a series title [determs’.title) or an alternative se¬ 
ries title [determs:alternative) has to be stated. 

• severity level: WARNING 

— DISGO-G-GONDITIONAL-PROPERTIES-03: In order to get an overview 
over a series or a study either an abstract, a title, an alternative title, or links 

to external descriptions should be stated. If the abstract [determs:abstract) 
of a study [disco’.Study) and an external description of the study [disco:ddifile) 
is missing, a study title [determs’.title) or an alternative study title [dc- 
terms:alternative) has to be stated. 

• severity level: WARNING 

DISGO-G-GONDITIONAL-PROPERTIES-04: If the abstract [dc- 
terms:abstract) of a series [disco’.StudyGroup), an external description of the 
series [disco.’ddifile), a series title [determs’.title), and an alternative series 
title [determs:alternative) is missing, an error message should be shown. 

• severity level: ERROR 

DISGO-G-GONDITIONAL-PROPERTIES-05: If the abstract [dc- 
terms:abstract) of a study [disco:Study), an external description of the study 
[disco.’ddifile), a study title [determs’.title), and an alternative study title 
[determs:alternative) is missing, an error message should be shown. 

R-51-HTML-HANDLING-OF-RDF-LITERALS 

R-71-CONDITIONAL-PROPERT1ES 



• severity level: ERROR 

- DISCO-C-CONDITIONAL-PROPERTIES-06: If a category statis¬ 
tics resource is connected with a code, it must be stated if the code is valid 
{discodsValid) and the code must be stated {skos:notation) 

• severity level: ERROR 

4.70 Recommended Properties 

Which properties are not necessarily required but recommended within a par¬ 
ticular contexlF^. 

- DATA-CUBE-C-RECOMMENDED-PROPERTIES-01: 

• severity level: INFO 

- DCAT-C-RECOMMENDED-PROPERTIES-01: 

• severity level: INFO 

- DISCO-C-RECOMMENDED-PROPERTIES-01: The property skos:notation 
is not mandatory for disco: Variables, but recommended to indicate variable 
names. 

• severity level: INFO 

- PHDD-C-RECOMMENDED-PROPERTIES-01: 

• severity level: INFO 

- SKOS-C-RECOMMENDED-PROPERTIES-01 : 

• severity level: INFO 

- XKOS-C-RECOMMENDED-PROPERTIES-01: 

• severity level: INFO 

4.71 Handle RDF Collections 

Examples of the Handle RDF Collections^ constraint are: a collection must have 
a specific size; the first/last element of a given list must be a specific literal; the 
elements of collections are compared; are collections identical?; actions on RDF 
listj^: the 2. list element must be equal to ’XXX’; does the list have more than 
10 elements? 

- DISCO-C-HANDLE-RDF-COLLECTIONS-01: Have comparable disco: Variables 
the same number of codes in their code lists? 

• severity level: INFO 

DISCO-C-HANDLE-RDF-COLLECTIONS-02: Does the actual num¬ 
ber of disco: Variables within an (un)ordered collection of a given disco:LogicalDataSet 
match the expected number? 

• severity level: INFO 

R-72-RECOMMENDED-PROPERTIES 

R-120-HANDLE-RDF-COLLECTIONS 

See http://www.snee.com/bobdc.blog/2014/04/rdf-lists-and-sparql.html 




4.72 Value is Valid for Datatype 


Make sure that a value is valid for its datatype. It has to be ensured, e.g., that 
a date is really a date, or that a xsdmonNegativeInteger value is not negative. 

- DISCO-C- VAL UE-IS- VALID-FOR-DATATYPE-01: Check if all lit¬ 
eral values of properties used within the Disco context of the datatype 
xsd:date (e.g. disco:startDate, disco:endDate, dcterms:date) are really of the 
datatype xsd:date. 

• severity level: ERROR 

- DISCO-C- VAL UE-IS- VALID-FOR-DATATYPE-02: Frequencies (disco:frequency) 
cannot be negative, i.e., must correspond to the XML Schema datatype 
xsdmonNegativeInteger. 

• severity level: ERROR 

DATA-CUBE-C-VALUE-IS-VALID-FOR-DATATYPE-01: Datatype 
consistency {IC-0 [5]) - The RDF graph must be consistent under RDF D- 
entailment using a datatype map containing all the datatypes used within 
the graph. 

• severity level: ERROR 

4.73 Use Sub-Super Relations in ValidatiorJ^ 

The validation of instances data (direct or indirect) exploits the sub-class or sub¬ 
property link in a given ontology. This validation can indicate when the data is 
verbose (redundant) or expressed at a too general level, and could be improved. 

If dcterms:date and one of its sub-properties determs:created or determs:issued 
are present, e.g., check that the value in dcterms:date is not redundant with 
determs .'Created or determs :issued for ingestion. 

DISCO-C-USE-SUB-SUPER-RELATIONS-IN-VALIDATION-01: 

If one or more dcteTins:coverage properties are present, suggest the use of 
one of its sub-properties determs:spatial or determs:temporal. 

• severity level: INFO 

- DISCO-C-USE-SUB-SUPER-RELATIONS-IN-VALIDATION-02: 

If the determs .'contributor property is present, suggest the use of one of its 
sub-properties, e.g. disco:fundedBy. 

• severity level: INFO 

4.74 Cardinality Shortcuts 

In most library applications, cardinality shortcuts tend to appear in pairs, with 
repeatable/non-repeatable establishing maximum cardinality and optional/mandatory 
establishing minimum cardinality. These are shortcuts for more detailed cardi¬ 
nality restrictions. 


R-224-USE-SUB-SUPER-RELATIONS-IN- VALIDATION 



4.75 Aggregations 

Some constraints require aggregating multiple values, especially via COUNT, 
MIN and MAX. 

— DISCO-C-AGGREGATION-01: calculate the number of theoretical con¬ 
cepts in the thematic classihcation of a given study. 

• severity level: INFO 

— DISGO-G-AGGREGATION-02: calculate the number of variables of a 
data set. 

• severity level: INFO 

— DISGO-G-AGGREGATION-03: calculate the number of questions in a 
given questionnaire. 

• severity level: INFO 

— DISGO-G-AGGREGATION-O 4 ■ the number of codes of a given variable 
must be below a maximum value. 

• severity level: INFO 

— DISGO-G-AGGREGATION-05: the number of questions of a given 
questionnaire must exactly be a given value. 

• severity level: INFO 

— DISGO-G-AGGREGATION-06-, the sum of percentages of all codes of 
a given variable must be 100. 

• severity level: INFO 

— DISGO-G-AGGREGATION-07-, the absolute frequency of all valid codes 
of a given variable must be equal to a given value. 

• severity level: INFO 


4.76 Provenance 

— DISGO-G-PROVENANOE-Ol: Series should have provenance informa¬ 
tion {dcterms-.provenance). 

• severity level: INFO 

— DISOO-O-PROVENANOE-02: Studies should have provenance infor¬ 
mation (dcterms.-provenance). 

• severity level: INFO 

— DISOO-O-PROVENANOE-03: Data sets should have provenance infor¬ 
mation {determs :provenance). 

• severity level: INFO 

— DISOO-O-PROVENANOE-04-- Data files should have provenance in¬ 
formation {dcterms:provenance). 

• severity level: INFO 



4.77 Comparison 


- DISCO-C-COMPARISON-VARIABLES-01. are compared variables 
represented in a compatible way, i.e. are the variables’ code lists theoretically 
comparable? 

• severity level: WARNING 

- DISCO-C-COMPARISON-VARIABLES-02: are variable definitions 
(determs:description) available for each variable (disco.-Variable) to com¬ 
pare? 

• severity level: ERROR 

- DISCO-C-COMPARISON- VARIABLES-03-. are code lists structured 
properly for each variable (disco-.Variable) to compare? 

• severity level: ERROR 

- DISCO-C-COMPARISON- VARIABLES-O 4 : is for each code (for each 
variable (disco: Variable) to compare) an associated category (a human-readable 
label) specified? 

• severity level: INFO 

- DISCO-C-COMPARISON-VARIABLES-05: each (disco:Variable) to 
compare must be present. 

• severity level: ERROR 


4.77 Data Model Consistency 

Is the data consistent with the intended semantics of the data model? Such 
validation rules ensure the integrity of the data according to the data model. 

- DISCO-C-DATA-MODEL-CONSISTENCY-01 : Codes (skos:Concept) 

are ordered and therefore have fixed positions in an ordered collection (skos:OrderedCollection) 
within a variable representation. The cumulative percentage of the current 
code is the cumulative percentage of the previous code (disco:cumulativePercentage) 
plus the percentage value (disco.-percentage) of the current code. 

• severity level: ERROR 

DISCO-C-DATA-MODEL-CONSISTENCY-02: The cumulative per¬ 
centage (disco-.cumulativePercentage) of the last code must be 100. 

• severity level: ERROR 

- DISCO-C-DATA-MODEL-CONSISTENCY-03-. The number of valid 
cases (disco: Summary Statistics of the type (disco:summaryStatisticType) ddiev- 
sumstats.-ValidCases) for a particular variable must exactly be the sum of 
all frequencies of all valid cases (disco:inValid of skos:Concept is true). 

• severity level: ERROR 

DISCO-C-DATA-MODEL-CONSISTENCY- 04 : The number of in¬ 
valid cases (disco: Summary Statistics of the type (disco: .summary StatisticType) 
ddicv-sumstats:InvalidCases) for a particular variable must exactly be the 
sum of all frequencies of all invalid cases (disco-.inValid of skos:Concept is 
false). 

• severity level: ERROR 



DISCO-C-DATA-MODEL-CONSISTENCY-05: The total number of 

cases {rdf:value of the disco'.SummaryStatistics resource of the type {disco:summaryStatisticType) 

ddicv-sumstats:NumberOfCases) for a particular variable must exactly be the 

number of valid cases plus the number of invalid cases. 

• severity level: ERROR 

DISCO-C-DATA-MODEL-CONSISTENCY-06. Some summary statis¬ 
tics types can only be calculated for given variable types. It is not possible 
to compute minimum values for string variables. 

• severity level: ERROR 

DISCO-C-DATA-MODEL-CONSISTENCY-07: Some summary statis¬ 
tics types can only be calculated for given variable types. It is not possible 
to compute mean values for categorical variables, only for metric variables. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-01: Only attributes 
may be optional {IC-6 [5]) - The only components of a qb:DataStructureDefinition 
that may be marked as optional, using qb: component Required are attributes. 

• severity level: WARNING 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-02: Slice Keys con¬ 
sistent with DSD {IC-8 [5]) - Every qb:componentProperty on a qb:SliceKey 
must also be declared as a qb:component of the associated qb:DataStructureDefinition. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-03: Slice dimen¬ 
sions complete {IC-10 [S]) - Every qb:Slice must have a value for every di¬ 
mension declared in its qb:sliceStructure. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-04: All dimensions 
required {IC-11 [2) - Every qb:Observation has a value for each dimension 
declared in its associated qb:DataStructureDefinition. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-05: No duplicate 
observations {IC-12 [S]) - No two qb:Observations in the same qb:DataSet 
may have the same value for all dimensions. 

• severity level: WARNING 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-06: Required at¬ 
tributes {IC-13 [S]) - Every qb:Observation has a value for each declared 
attribute that is marked as required. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-07: All measures 
present {IC-14 [5]) - In a qb:DataSet which does not use a Measure dimen¬ 
sion then each individual qb:Observation must have a value for every declared 
measure. 

• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-08: Measure di¬ 
mension consistent {IC-15 [5]) - In a qb:DataSet which uses a Measure di¬ 
mension then each qb:Observation must have a value for the measure corre¬ 
sponding to its given qb:measureType. 


• severity level: ERROR 

DATA-CUBE-C-DATA-MODEL-CONSISTENCY-09: Single mea¬ 
sure on measure dimension observation {IC-16 [5]) - In a qbiDataSet which 
uses a Measure dimension then each qb:Observation must only have a value 
for one measure (by IC-15 this will be the measure corresponding to its 
qb:measureType). 

• severity level: ERROR 

- DATA-CUBE-C-DATA-MODEL-CONSISTENCY-10: All measures 
present in measures dimension cube {IC-17 [5]) - In a qb:DataSet which uses 
a Measure dimension then if there is a Observation for some combination 
of non-measure dimensions then there must be other Observations with the 
same non-measure dimension values for each of the declared measures. 

• severity level: ERROR 

- DATA-CUBE-C-DATA-MODEL-CONSISTENCY-11: Consistent data 
set links {IC-18 [5]) - If a qb:DataSet D has a qb:slice S, and S has an 
qb:observation 0, then the qb:dataSet corresponding to 0 must be D. 

• severity level: WARNING 

- SKOS-C-DATA-MODEL-CONSISTENCY-0^: Relation Clashes: Cov¬ 
ers condition S27 from the SKOS reference document, that has not been 
defined formally. 

• Implementation: In a first step, all pairs of concepts are found that are 
associatively connected, using a SPARQL query. In the second step, a 
graph is created, containing only hierarchically related concepts and the 
respective relations. Eor each concept pair from the first step, we check 
for a path in the graph from step two. If such a path is found, a clash 
has been identified and the causing concepts are returned. 

• Severity level: INEO 

- SKOS-C-DATA-MODEL-CONSISTENCY-0^: Mapping Clashes: Cov¬ 
ers condition S46 from the SKOS reference document, that has not been 
defined formally. 

• Implementation: Can be solved by issuing a SPARQL query. 

• Severity level: INEO 

- SKOS-C-DATA-MODEL-CONSISTENCY-0^: Mapping Relations 
Misuse: According to the SKOS reference documentation, mapping relations 
(e.g., skos:broadMatch or skos:relatedMatch) should be asserted to concepts 
being members of different concept schemes. This check finds concepts that 
are related by a mapping property and are either members of the same 
concept scheme or members of no concept scheme at all. 

• Severity level: INEO 

Corresponds to qSKOS Quality Issues - SKOS Semi-Formal Consistency Issues - 
Relation Clashes 

Corresponds to qSKOS Quality Issues - SKOS Semi-Formal Consistency Issues - 
Mapping Clashes 

Corresponds to qSKOS Quality Issues - SKOS Semi-Formal Consistency Issues - 
Mapping Relations Misuse 



4.78 Structure 


SKOS is based on RDF, which is a graph-based data model. Therefore we can 
concentrate on the vocabulary’s graph-based structure for assessing the quality 
of SKOS vocabularies and apply graph- and network-analysis techniques. 

— DISCO-C-STRUCTURE-01: there must be exactly one root in the hi¬ 
erarchy of DDI concepts. 

• severity level: ERROR 

— DATA-CUBE-C-STRUCTURE-01: Codes from hierarchy {IC-20 [S]) 

- If a dimension property has a qb:HierarchicalCodeList with a non-blank 
qb:parentChildProperty then the value of that dimension property on every 
qb: Observation must be reachable from a root of the hierarchy using zero or 
more hops along the qb .'parent ChildProperty links. 

• severity level: ERROR 

— DATA-CUBE-C-STRUCTURE-02: Codes bomhieia.rchy (mverse) {IC- 
21 0) - If a dimension property has a qb:HierarchicalCode List with an in¬ 
verse qb.'parentChildProperty then the value of that dimension property on 
every qb:Observation must be reachable from a root of the hierarchy using 
zero or more hops along the inverse qb.'parentChildProperty links. 

• severity level: ERROR 

— SKOS-C-STRUCTURE-0:^: Orphan Concepts: An orphan concept is 
a concept without any associative or hierarchical relations. It might have 
attached literals like e.g., labels, but is not connected to any other resource, 
lacking valuable context information. A controlled vocabulary that contains 
many orphan concepts is less usable for search and retrieval use cases, be¬ 
cause, e.g., no hierarchical query expansion can be performed on search terms 
to find documents with more general content. 

• Implementation: Iteration over all concepts in the vocabulary and re¬ 
turning that don’t have associated resources using (sub-properties of) 
skos'.semanticRelation. 

• Severity level: WARNING 

— SKOS-C-STRUCTURE-0^^: Disconnected Concept Clusters: Checking 
the connectivity of the graph, it is possible to identify all weakly connected 
components. These datasets form "islands” in the vocabulary and might be 
caused by incomplete data acquisition, "forgotten” test data, outdated terms 
and the like. 

• Implementation: Creation of an undirected graph that includes all non¬ 
orphan concepts as nodes and all semantic relations as edges. Tarjan’s 
algorithm then finds and returns all weakly connected components. 

• Severity level: INEO 

Corresponds to qSKOS Quality Issues - Structural Issues - Orphan Concepts 
Corresponds to qSKOS Quality Issues - Structural Issues - Disconnected Concept 
Clusters 



- SKOS-C-STRUCTURE-0^: Cyclic Hierarchical Relations: Although 

perfectly consistent with the SKOS data model, cyclic relations may reveal 
a logical problem in the thesaurus. Consider the following example: ’’deci¬ 
sion” —> ’’problem resolution” ^ ’’problem” ’’decision”: here the cycle is 
closed). The concepts are connected using skos:broader relationships (indi¬ 
cated with ” ). Due to the fact that a thesaurus is in many cases a product 

of consensus between the contributors (or just the decision of one dedicated 
thesaurus manager), it will be almost impossible to automatically resolve 
the cycle (i.e. deleting an edge). 

• Implementation: Construction of a graph having all concepts as nodes 
and the set of edges being skos:broader relations. 

• Severity level: WARNING 

— SKOS-C-STRUCTURE-0.^: Valueless Associative Relations: Two con¬ 
cepts are sibling, but also connected by an associative relation. In that con¬ 
text, the associative relation is not necessary. See ISO_DIS_25964-l, 11.3.2.2 

• Implementation: Identification of all pairs of concepts that have the same 
broader or narrower concepts, i.e. they are ’’sibling terms”. All siblings 
that are related by a skos:related property are returned. 

• Severity level: INFO 

— SKOS-C-STR. UCTTIRE- 0,¥^: Solely Transitively Related Concepts: skos:broaderTransitive 
and skosmarrowerTransitive are, according to the SKOS reference document, 

’’not used to make assertions”, so they should not be the only relations hi¬ 
erarchically relating two concepts. 

• Implementation: Identification of all concept pairs that are related by 
skos'.broaderTransitive or skos.'narrowerTransitive properties but not by 
their skos:broader and skos'.narrower subproperties. 

• Severity level: INFO 

- SKOS-C-STRUCTURE-0^^: Unidirectionally Related Concepts: Re¬ 
ciprocal relations (e.g., broader/narrower, related, hasTopConcept/topConceptOf) 
should be included in the controlled vocabularies to achieve better search re¬ 
sults using SPARQL in systems without reasoner support. 

• Implementation: This issue is checked without inference of owliinverseOf 
properties. We iterate over all triples and check for each property if an 
inverse property is defined in the SKOS ontology and if the respective 
statement using this property is included in the vocabulary. If not, the 
resources associated with this property are returned. 

• Severity level: INFO 

Corresponds to qSKOS Quality Issues - Structural Issues - Cyclic Hierarchical Re¬ 
lations 

Corresponds to qSKOS Quality Issues - Structural Issues - Valueless Associative 
Relations 

Corresponds to qSKOS Quality Issues - Structural Issues - Solely Transitively Re¬ 
lated Concepts 

Corresponds to qSKOS Quality Issues - Structural Issues - Unidirectionally Related 
Concepts 



— SKOS-C-STRUCTURE-0'¥^: Omitted Top Concepts: A vocabulary should 
provide ’’entry points” to the data to provide “efficient access” (SKOS primer) 
and guidance for human users. 

• Implementation: For every ConceptScheme in the controlled vocabulary, 
a SPARQL query is issued finding resources that are associated with 
this ConceptScheme by one of the properties skos:has Top Concept or 
skos:topConceptOj. Top concepts are also concepts having no broader 
concept. 

• Severity level: WARNING 

— SKOS-C-STRUCTURE-0^^: Top Concepts Having Broader Concepts: 
Concepts ’’internal to the tree” should not be indicated as top concepts. 

• Implementation: A SPARQL query finds all top concepts (being defined 
by one of the properties skos:hasTopConcept or skos:topConceptOf) hav¬ 
ing associated a broader concept. 

• Severity levef: ERROR 

— SKOS-C-STRUCTURE-0^^^: Hierarchical Redundancy: As stated in 
the SKOS reference document, skos:broader and skos:narrower are not tran¬ 
sitive properties. However, they are sub-properties of skos:broaderTransitive 
and skos'.narrowerTransitive which enables inference of a ’’transitive clo¬ 
sure” . This, in fact, leaves it up to the user to interpret whether a vocabu¬ 
lary’s hierarchical structure is seen as transitive or not. In the former case, 
this check can be useful. It finds pairs of concepts (A,B) that are directly 
hierarchically related but there also exits an hierarchical path through a 
concept C that connects A and B. 

• Severity level: INFO 

— SKOS-C-STRUCTURE-1^^^: Reflexive Relations: Concepts related to 
themselves. 

• Severity level: WARNING 


4.79 Labeling and Documentation 

DISCO-C-LABELING-AND-DOCUMENTATION-01: Series should 
be described {determs:description). 

• severity level: INFO 

DISCO-C-LABELING-AND-DOCUMENTATION-02: Studies should 
be described {determs:description). 

• severity level: INFO 

DISGO-G-LABELING-AND-DOGUMENTATION-03: Data sets should 
be described {determs:description). 

• severity level: INFO 

Corresponds to qSKOS Quality Issues - Structural Issues - Omitted Top Concepts 
Corresponds to qSKOS Quality Issues - Structural Issues - Top Concepts Having 
Broader Concepts 

Corresponds to qSKOS Quality Issues - Structural Issues - Hierarchical Redundancy 
Corresponds to qSKOS Quality Issues - Structural Issues - Reflexive Relations 



- DISCO-C-LABELING-AND-DOCUMENTATION-04-Batamesshon\d 

be described {determs:description). 

• severity level: INFO 

DISCO-C-LABELING-AND-DOCUMENTATION-05: Instruments 
should be described (dcterms:description). 

• severity level: INFO 

- DISGO-G-LABELING-AND-DOGUMENTATION-06: Variables should 
be described {determs:description). 

• severity level: INFO 


- SKOS-G-LABELING-AND-DOGUMENTATION-Oi 


102 


Undocumented 


Concepts: The SKOS standard defines a number of properties useful for 
documenting the meaning of the concepts in a thesaurus also in a human- 
readable form. Intense use of these properties leads to a well-documented 
thesaurus which should also improve its quality. 

• Implementation: Iteration over all concepts in the vocabulary and find 

those not using one of skos:note, skos:changeNote, skos:defLnition, skos:editorialNote, 
skos:example, skos:historyNote, or skos:scopeNote. 

• Severity level: INFO 

- SKOS- G-LABELING-AND-D O G UMENTA TION- 0#^; Overlapping 
Labels: This is a generalization of a recommendation in the SKOS primer, 
that “no two concepts have the same preferred lexical label in a given lan¬ 
guage when they belong to the same concept scheme”. This could indicate 
missing disambiguation information and thus lead to problems in autocom¬ 
pletion application. 

• Severity level: INFO 


- SKOS-G-LABELING-AND-DOGUMENTATION-Ol 


?104 


; Missing La¬ 


bels: To make the vocabulary more convenient for humans to use, instances 
of SKOS classes (Concept, ConceptScheme, Collection) should be labeled 
using e.g., skos:prefLabel, altLabel, rdfs:label, dc.title. 

• Severity level: INFO 

SKOS-G-LABELING-AND-DOGUMENTATION-0^: Unprintable 
Characters in Labels: pref/alt/hiddenlabels contain characters that are not 
alphanumeric characters or blanks. 

• Severity level: INFO 

- SKOS- G-LABELING-AND-D O G UMENTA TION- 0^: Empty La¬ 
bels: Labels also need to contain textual information to be useful, thus we 
find all SKOS labels with length 0 (after removing whitespaces). 


Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Un¬ 
documented Concepts 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Over¬ 
lapping Labels 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Missing 
Labels 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Un¬ 
printable Characters in Labels 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Empty 
Labels 





• Severity level: INFO 

- SKOS-C-LABELING-AND-DOCUMENTATION- 0^: Ambiguous 
Notation References: Concepts within the same concept scheme should not 
have identical skos:notation literals. 

• Severity level: INFO 


4.80 Vocabulary 

Vocabularies should not invent any new terms or use deprecated elements. 


- DATA-CUBE-C-VOCABULARY-01 

• Severity level: ERROR 

- DCAT-C-VOCABULARY-01 

• Severity level: ERROR 

- DISCO-C-VOCABULARY-01 

• Severity level: ERROR 


- PHDD-C-VOCABULARY-01 

• Severity level: ERROR 


SKOS-C-VOCABULARY-Oa Undefined SKOS Resources: The vo¬ 
cabulary should not invent any new terms within the SKOS namespace or 
use deprecated SKOS elements. 

• Severity level: ERROR 


XKOS-C-VOCABULARY-01 

• Severity level: ERROR 


4.81 HTTP URI Scheme Violation 

DISCO-C-HTTP- URI-SCHEME- VIOLA TIOI^: In the context of 
Linked Data, we restrict ourselves to using HTTP URIs only and avoid other 
URI schemes such as URNs and DOIs. 

• Severity level: ERROR 

- DATA-CUBE-C-HTTP-URI-SCHEME- VIOLATIOT^ In the con¬ 
text of Linked Data, we restrict ourselves to using HTTP URIs only and 
avoid other URI schemes such as URNs and DOIs. 

Corresponds to qSKOS Quality Issues - Labeling and Documentation Issues - Am¬ 
biguous Notation References 

Corresponds to qSKOS Quality Issues - Linked Data Specific Issues - Undefined 
SKOS Resources 

Corresponds to qSKOS Quality Issues - Linked Data Specific Issues - HTTP URI 
Scheme Violation 

Corresponds to qSKOS Quality Issues - Linked Data Specific Issues - HTTP URI 
Scheme Violation 




• Severity level: ERROR 

- PHDD-C-HTTP- URI-SCHEME- VIOLA TlOlf^: In the context of 
Linked Data, we restrict ourselves to using HTTP URIs only and avoid 
other URI schemes such as URNs and DOIs. 

• Severity level: ERROR 

- SKOS- C-HTTP- URI-SCHEME- VIOLA TIOl¥^. In the context of 
Linked Data, we restrict ourselves to using HTTP URIs only and avoid 
other URI schemes such as URNs and DOIs. 

• Severity level: ERROR 

5 Conclusion 

We identified and published by today 81 types of constraints that are required 
by various stakeholders for data applications. In close collaboration with several 
domain experts for the social, behavioral, and economic sciences (SHE), we for¬ 
mulated constraints on common SHE vocabularies and classified them according 
to their severity level. 
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